perm filename 0[00,BGB]1 blob
sn#043252 filedate 1973-05-22 generic text, type C, neo UTF8
COMMENT ā VALID 00006 PAGES
C REC PAGE DESCRIPTION
C00001 00001
C00002 00002 STANFORD ARTIFICIAL INTELLIGENCE LABORATORY MAY 1973.
C00005 00003 PAge 1. Illustration: 16 frames of a doll baby on a turn table.
C00006 00004 PAGE 2. INTRODUCTION.
C00015 00005 PAGE 3. Illustration: 32 frames of blocks on a turn table.
C00016 00006 PAGE 4. INTRODUCTION.
C00020 ENDMK
Cā;
STANFORD ARTIFICIAL INTELLIGENCE LABORATORY MAY 1973.
MEMO AIM-199
COMPUTER SCIENCE DEPARTMENT REPORT
NO. CS-362
Image Contouring and Comparing.
Bruce g. Baumgart
ABSTRACT:
A contour image representation is stated and an algorithm
for converting a set of digital television images into this
representation is explained. The algorithm consists of five steps:
digital image thresholding, binary image contouring, polygon
nesting, polygon smoothing, and polygon comparing. An
implementation of the algorithm is the main routine of a program
called CRE; auxiliary routines provide cart and turn table control,
TV camera input, image display, and xerox printer output. A
serendip application of CRE to type font contruction is explained.
Details about the intended application of CRE to the perception of
physical objects will appear in sequels to this paper.
CONTENTS:
Introduction.
I. The CRE data structure.
II. The CRE algorithm.
III. Using CRE.
IV. Using TVFONT.
Postscripts.
This research was supported by the Advanced Research Projects Agency
of the Office of the Secretary of Defense under contract SD-183.
PAge 1. Illustration: 16 frames of a doll baby on a turn table.
PAGE 2. INTRODUCTION.
The acronym CRE stands both for "Contour, Region, Edge" and
for "Cart's Eye". CRE is a solution to the problem of finding
contour edges in a set of television pictures and linking
corresponding edges from one picture to the next. The process is
automatic and is intended to run without human intervention.
Furthermore, the process is bottom up; there are no significant
inputs other than the given television images. The output of CRE is
a 2D contour map data structure which is suitable input to a 3D
geometric modeling program.
The overall design goal for CRE was to build a region edge
finding program that could be applied to a sequence of television
pictures and that would output a sequence of line drawings without
having to know anything about the content of the images. Furthermore
it was desired that the line drawings be structured. The six design
choices that determined the character of CRE are:
1. Dumb vision rather than model driven vision.
2. Multi image analysis rather than single image analysis.
3. Total image structure imposed on edge finding; rather
than separate edge finder and image analyzer.
4. Automatic rather than interactive.
5. Fixed image window size rather than variable window size.
6. Machine language rather than higher level language.
The design choices are ordered from the more strategic to
the more tactical; the first three choices being research
strategies, the latter three choices being programming tactics.
Adopting these design choices lead to image contouring and contour
map structures similar to that of Krakauer[3] and Zahn[4].
The first design choice does not refer to the issue of how
model dependent a finished general vision system will be (it will be
quite model dependent), but rather to the issue of where one should
begin building such a system. I believe that the best starting
points are at the two apparent extremes of nearly total knowledge of
a particular visual world or nearly total ignorance. The former
extreme involves synthesis (by computer graphics) of a predicted 2D
image, followed by comparing the predicted and a perceived image for
slight differences which are expected but not yet measured. The
other extreme involves anaylsing perceived images into structures
which can be readily compared (for near equality) and measured for
slight differences; followed by the construction of a 3D geometric
model of the perceived world. The point is that in both cases images
are compared, and in both cases the 3D model initially (or finally)
contains specific numerical data on the geometry and physics of the
particular world being looked at.
PAGE 3. Illustration: 32 frames of blocks on a turn table.
PAGE 4. INTRODUCTION.
The second design choice, of multi image anaylsis rather
than single image analysis, provides a basis for solving for camera
positions and feature depths. The third design choice solves (or
rather avoids) the problem of integrating an edge finder's results
into an image; by using a very simple edge finder and by accepting
all its findings as valid photometric edges that are indeed a part
of the image, the image structure is never lost. This design
postpones the problem of interpreting photometric edges as physical
edges.
The fourth choice is a resolution not to write an image
processor that requires operator assistance and parameter tuning.
The fifth choice of the 216 by 288 fixed window size is a sin that
proved surprisingly expedient, it is explained later. A variable
window version of CRE at halves, thirds and other simple fractions
of its present window size will be made at some future date.
The final design choice of using machine language was for
the sake of implementing node link data structures that are
processed 100 faster than LEAP, 10 times faster than compiled LISP
and that require significantly less memory than similar structures
in either LISP or LEAP. Furthermore machine code assembles and loads
faster than higher level languages; and machine code can be
extensively fixed and altered without recompiling.
It is my impression that CRE does not raise any new
scientific problems; nor does it have any really new solutions to
the old problems; rather CRE is another competent video region edge
finding program with its own set of tricks. However, it is further
my impression that the particular tricks for smoothing, nesting and
comparing polygons in CRE are original as programming techniques.
The intended use of CRE is illustrated by the sequences of
turn table pictures on pages 1 and 3. The first results of applying
CRE to typography is illustrated below:
Christmas Font
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z.
a b c d e f g h i j k l m n o p q r s t u v w x y z.
ACKNOWLEDGEMENT.
Tovar Mock assisted me with the development of the type font
making program, TVFONT.